Add fuzz testing for memo #111

Open. jopereira wants to merge 5 commits into base: main

jopereira

Problem

The memo data structure is too complex to cover with an exhaustive set of hand-written tests.

Summary of changes

Add fuzz testing based on the data generator and workload from https://github.com/jopereira/memobench. For now, this tests only logical expressions.

@AlSchlo
Collaborator

AlSchlo commented May 23, 2025

I can't find the FuzzData, etc. structs. Could you please make the branch in cmu-db/optd?

@jopereira
Author

Missed a file. It should be there now.

@AlSchlo
Collaborator

AlSchlo commented May 23, 2025

It seems that the fuzzing code only creates one group that contains more than 1 expression.

Troubleshooting right now.

@jopereira
Author

The default arguments are low. Try 10k groups and 100 expressions. It will, however, take some time to run, so it should probably be disabled by default.

@AlSchlo
Collaborator

AlSchlo commented May 23, 2025

Indeed. For some reason, though, I cannot make it catch the previous regression (i.e. duplicate expressions in a group).

@AlSchlo
Collaborator

AlSchlo commented May 23, 2025

Notably:

    /// Takes the set of [`LogicalExpressionId`] that reference a group, mapped to their
    /// representatives.
    fn take_referencing_expr_set(&mut self, group_id: GroupId) -> HashSet<LogicalExpressionId> {
        self.group_referencing_exprs_index
            .remove(&group_id)
            .unwrap_or_default()
            .iter()
            .map(|id| self.repr_logical_expr_id.find(id))
            .collect()
    }

in helpers.rs: removing the remapping-to-representative logic SHOULD make the test fail, and it did when I ran the other testbench.
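
Why dropping the `find` remapping should produce duplicates can be seen in a minimal sketch. The `UnionFind` type and the plain `usize` ids below are hypothetical stand-ins, not optd's actual types: once two expression ids have been merged, a set of raw ids still holds both, while mapping each id through `find` collapses them to a single representative.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a representative index (not optd's actual type).
struct UnionFind {
    parent: Vec<usize>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        Self { parent: (0..n).collect() }
    }

    /// Returns the representative of `id` by following parent links to the root.
    fn find(&self, mut id: usize) -> usize {
        while self.parent[id] != id {
            id = self.parent[id];
        }
        id
    }

    /// Merges the class of `from` into the class of `into`.
    fn merge(&mut self, from: usize, into: usize) {
        let (from, into) = (self.find(from), self.find(into));
        if from != into {
            self.parent[from] = into;
        }
    }
}

/// Maps every id to its representative, so merged ids collapse to one entry.
fn remapped(ids: &HashSet<usize>, uf: &UnionFind) -> HashSet<usize> {
    ids.iter().map(|&id| uf.find(id)).collect()
}

/// Builds a referencing set {1, 2}, merges 2 into 1, and returns
/// (size of the raw set, size of the remapped set).
fn stale_vs_remapped() -> (usize, usize) {
    let mut uf = UnionFind::new(4);
    let referencing: HashSet<usize> = HashSet::from([1, 2]);
    uf.merge(2, 1);
    (referencing.len(), remapped(&referencing, &uf).len())
}
```

In this sketch `stale_vs_remapped()` returns `(2, 1)`: the raw set still counts the merged expression twice, which is the kind of duplicate a fuzz test would be expected to flag.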

@jopereira
Author

I'll check this today.

@jopereira
Author

It was a typo in the test, which discarded the result of the shuffle method. With the fix, the test reproduces the bug when the remapping is commented out.
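
A mistake of this kind is easy to make when a shuffle helper returns a new collection instead of mutating in place. The sketch below is an assumption about the general shape of such a bug, not the code from this PR: `shuffled`, `buggy_order`, `fixed_order`, and the tiny `Lcg` generator are all hypothetical, and `#[must_use]` is one way to have the compiler warn when the result is dropped.

```rust
/// Tiny linear congruential generator, a hypothetical stand-in for a real RNG.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        // Multiplier and increment from Knuth's MMIX LCG.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Returns a shuffled copy of `items` (Fisher-Yates); the input is left untouched.
#[must_use = "the input is not modified; use the returned vector"]
fn shuffled<T: Clone>(items: &[T], rng: &mut Lcg) -> Vec<T> {
    let mut out = items.to_vec();
    for i in (1..out.len()).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        out.swap(i, j);
    }
    out
}

/// The buggy pattern: the shuffled copy is dropped, so the order never changes.
fn buggy_order(items: &[u32], rng: &mut Lcg) -> Vec<u32> {
    let _ = shuffled(items, rng); // result explicitly discarded
    items.to_vec()
}

/// The fixed pattern: the shuffled copy is what gets used.
fn fixed_order(items: &[u32], rng: &mut Lcg) -> Vec<u32> {
    shuffled(items, rng)
}
```

With the buggy pattern the workload is always applied in generation order regardless of the seed, so order-dependent bugs can stay hidden.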

@AlSchlo
Collaborator

AlSchlo commented May 27, 2025

I still cannot replicate it.

    /// Takes the set of [`LogicalExpressionId`] that reference a group, mapped to their
    /// representatives.
    fn take_referencing_expr_set(&mut self, group_id: GroupId) -> HashSet<LogicalExpressionId> {
        self.group_referencing_exprs_index
            .remove(&group_id)
            .unwrap_or_default()
    }

Was this the change you tried out?

@AlSchlo
Collaborator

AlSchlo commented May 27, 2025

I believe it might not be the correct seed. However, even when trying out the old memobench, I cannot replicate it unless I use your seed (I even ran it for an hour looking for another seed, but could not find one).

@jopereira
Author

Yes, that's what I'm testing, and I get it to fail with pretty much any seed. Let me try with a clean checkout of this branch just to be sure.

@jopereira
Author

It still works for me, i.e. a fresh clone of this PR reproduces the problem. StdRng is not supposed to be fully reproducible across different systems, but the failure is so easy to trigger that I find it hard to believe that this is the cause. Can you help me reproduce your testing environment?
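
One way to rule out the generator as the cause is to use an algorithm that is fixed explicitly, so that the same seed yields the same workload on every machine. The sketch below hand-rolls SplitMix64 purely for illustration; it is not what this PR uses, and the `workload` function is hypothetical. In a project that already depends on rand, a named algorithm such as `ChaCha8Rng` from the rand_chacha crate would be the more usual choice.

```rust
/// SplitMix64: a small generator with a fixed, published algorithm.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    /// Advances the state by a fixed increment and mixes it into an output.
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Generates `n` hypothetical workload values from `seed`.
fn workload(seed: u64, n: usize) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    (0..n).map(|_| rng.next_u64()).collect()
}
```

Because every step is plain wrapping integer arithmetic, two runs of `workload` with the same seed produce identical vectors, which makes a failing seed something that can be shared between machines.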
